DETAILED ACTION
Response to Arguments 

Applicant's arguments, see Remarks pages 1-5 and claim amendments, 
filed 10/16/2006, with respect to the rejection(s) of various claim(s) (e.g. 1-7, 14- 
15, 20-22, 25-27, 30, and 32-38) under various statutes have been fully 
considered and are persuasive. 

In the last Office Action, claims 8-10, 13, and 39-41 were allowed; claims 
16-17 and 28 were objected to. Examiner has withdrawn the allowability of all
claims except 8-10 and 13. 

Claims 23-24 are newly canceled, so any rejections against them are not
valid.

Claims 42-43 have been added. 

Independent claims currently rejected are 1, 14, 21, and 25 (note page 1
of Remarks), which have been amended.

The rejection of claim 14 under 35 USC 112, second paragraph, stands 
withdrawn in view of applicant's amendment to the claim. 

The rejection of claims 1-7, 14-15, 20-22, 25-27, 30, and 32-38 under 35 
USC 103(a) stands withdrawn in view of applicant's amendments to all the
outstanding independent claims (all others are dependent upon 1, 14, 21, and 
25). 

However, upon further consideration, a new ground(s) of rejection is made 
in view of various references as below.

In answer to certain arguments put forth by applicant on pages 2-3,
examiner perhaps did not make the point in the clearest possible manner. Figure
2 of the Bui reference shows a plurality of multipliers (M02, M01, and M00) that
process given elements of the partial sums. The multiplied, weighted values are 
added by addition units A01 and A00. These are clearly passed down the chain 
of adding elements, where A00 can be viewed as the last element, or 
alternatively the path can continue through A13 and A14 to produce the final 
convolved video output. Therefore, there are a plurality of units connected in a 
series fashion on a per-row basis. The rows are additionally connected in a 
serial fashion. The system of Bui may operate in part in a parallel mode (in that 
each row can be processed simultaneously) but the processing within each row 
and for the overall sum is done in a serial manner. Therefore, this is not a valid 
argument. 
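For illustration only, the serial accumulation described above can be sketched in software. This is a minimal sketch and not Bui's circuit: the function names are hypothetical, the delay elements are omitted, and a 3x3 window is assumed. Each row multiplies its pixels by preassigned coefficients and adds them onto a running sum, and the row outputs are then added in series to give the convolved output.

```python
def row_partial_sum(pixels, coeffs):
    """One row of the chain: weight each pixel, then add it onto the running sum."""
    running = 0
    for pixel, coeff in zip(pixels, coeffs):
        running += pixel * coeff  # multiplier output fed to the next adder
    return running


def convolve_window(window, kernel):
    """Add the per-row partial sums in series to produce one output value."""
    total = 0
    for row_pixels, row_coeffs in zip(window, kernel):
        total += row_partial_sum(row_pixels, row_coeffs)
    return total


# Hypothetical 3x3 neighbourhood and an all-ones (box) kernel.
window = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = convolve_window(window, box)
```

The rows could be evaluated at the same time, but within each row and across the row outputs the additions form a chain, which is the point made in the paragraph above.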

Applicant chooses to argue that certain terms are well known in the art of 
computer graphics, but ignores examiner's statements that the term 'rendered' in 
the art of computer graphics clearly shows that video that is altered is therefore 
'rendered' for display on a monitor, and that a video stream that is processed by 
a computer is thusly 'rendered' for display purposes. 

Examiner submits that while the term 'sample' is known within the
computer graphics community, it does not inherently require 'super-sampling' or
'multi-sampling', and more to the point is generally regarded as meaning the
opposite. Therefore, that argument is moot.

In response to applicant's argument that the references fail to show certain
features of applicant's invention, it is noted that the features upon which applicant 
relies (i.e., 'multi-sampling' or 'super-sampling') are not recited in the rejected 
claim(s). Although the claims are interpreted in light of the specification, 
limitations from the specification are not read into the claims. See In re Van 
Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). (Note specifically that 
applicant specifies them in claims 42 and 43, therefore it is clear that the term 
'sample' should be read in broader terms. Therefore, those amendments neatly 
undercut applicant's own arguments to that effect concerning claim 1 (see page 
3)). 

It is noted that the 'means' recited in claim are assumed to be the SM 
ASIC in Figure 8, as discussed on pages 9-10 of the instant specification. 
Applicant has not contested this point, which has been stated in multiple Office
Actions. Therefore, applicant has conceded this point. 

Claim Rejections - 35 USC § 112 

The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

Claim 39 is rejected under 35 U.S.C. 112, second paragraph, as being 
indefinite for failing to particularly point out and distinctly claim the subject matter 
which applicant regards as the invention. Note that the same error occurred with 
respect to claim 14 and applicant corrected the deficiency therein.

Specifically, the claim recites the number 'N' without specifying the range
of such a number. Therefore, the claim is indefinite. Also, if N were negative, 
that would make no sense. 

Specifically, if there were only one sample manager, it would therefore 
calculate partial sums - but if N is 0, then no partial sums would ever be 
calculated for anything, and the system would not work as designed. 

Therefore, if k were zero, where N could be zero, the system could never
work in the manner that the claim specifies, because in that case there would be no
processing elements or sample elements, only a single partial
sums bus connected to nothing. Other additional logical reasons will be
discussed at a later time.

Claims 25, 40, and 41 are all rejected under 35 USC 112, second 
paragraph, for mixing statutory classes of invention as per 77 USPQ2d 1140,
IPXL Holdings LLC v. Amazon.com Inc. (CAFC 2005). The claims mix system 
claims with method claims, thusly leaving the metes and bounds of the claims 
unclear. 

Allowable Subject Matter 

Claims 8-10 and 13 are allowed for the reasons discussed in the previous 
Office Actions, since applicant has corrected the deficiencies that they contained 
and they were previously indicated allowable. 

The indicated allowability of claims 16-17, 28, and 39 is withdrawn, and
claim 20 is still objected to.

Claims 40 and 41 would be allowable if rewritten or amended to overcome
the rejection(s) under 35 U.S.C. 112, 2nd paragraph, set forth in this Office 
action. 

Objections 

Claim 39 is objected to under 37 CFR 1.75 as being a substantial
duplicate of claim 16. When two claims in an application are duplicates or else 
are so close in content that they both cover the same thing, despite a slight 
difference in wording, it is proper to object to the other as being a substantial 
duplicate. See MPEP § 706.03(k). 

Definitions 

The term 'rendered' is not defined by applicant's specification. The 
standard definition for 'rendered' in the context of computer graphics is that an
image that is digitized and exists within some type of memory or storage, volatile 
or nonvolatile, is processed and sent to a display device - such as a LCD, CRT, 
and the like. 

This definition is consistent with the intrinsic record. Examiner is giving 
claims their broadest reasonable interpretation as per MPEP 2105. 

An image convolution is inherently a filtering operation in a mathematical 
sense. Further, in mathematics, one definition of a kernel of a
function (e.g. convolution) is: the equivalence relation on the function's domain that
roughly expresses the idea of "equivalent as far as the function f can tell".
However, the point is that image convolution represents a function, and that
function has a kernel in the mathematical sense that conveys the operation of the
function. Therefore, any convolution operation will inherently have a kernel. See
as an example Bui 1:5-2:45. 

Bui, which was previously substituted for Willson, provides a full 
explanation of how convolution is equivalent to filtering and Bui is clearly 
analogous art, since it convolves images. 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for 
all obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described 
as set forth in section 102 of this title, if the differences between the subject matter sought to
be patented and the prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary skill in the art to which 
said subject matter pertains. Patentability shall not be negatived by the manner in which the 
invention was made. 

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 
148 USPQ 459 (1966), that are applied for establishing a background for 
determining obviousness under 35 U.S.C. 103(a) are summarized as follows: 

1. Determining the scope and contents of the prior art.

2. Ascertaining the differences between the prior art and the claims at 
issue. 

3. Resolving the level of ordinary skill in the pertinent art. 

4. Considering objective evidence present in the application indicating 
obviousness or nonobviousness. 

Claims 1-2, 14-17, 39, and 42-43 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Wilson (US 5,129,092) in view of Bui et al (US
4,998,288 A) and Garlick (US 6,614,448 B1).

As to claim 1,

Wilson teaches the following elements, but does not do so completely:

A computer graphics system for generating pixels from a distributed convolution
of rendered samples comprising: (Wilson is designed
to process images in 8x8 bit submatrices, as in Abstract, and 2:45-3:4, where an
image inherently consists of pixels. Wilson performs convolution (18:20-60) on 
sections of an image broken down into smaller data pieces (1:5-20), utilizing a 
plurality of processors 10a-10n, which therefore constitutes 'distributed 
processing') 

-A plurality of sample managers connected in series; and (Wilson clearly teaches
in Abstract and in Figure 1 a plurality of processor groups each comprising
individual processing elements 10a-10h, wherein each group of elements is
connected to the element adjacent to it using data lines 11i-11n and register shift
lines 21i-21n, as set forth in 5:55-6:34; these elements are clearly connected in
series (as in 2:45-60, where it states that these elements are connected in a
linear chain, where in Figure 1 it is clear that data is input from data input device 20
over lines 21a and then passed in a one-way manner down the line on lines
21i→21n))

-A set of partial sums buses, wherein each partial sums bus connects one of the 
sample managers of the series to the next sample manager in the series; (Wilson 
- the term bus is well known in the art to merely mean one or more data transfer 
lines, wherein the data lines 11i-11n and 21i-21n clearly move data in byte-size
chunks, which therefore means that they are "buses" in the sense meant by
applicant. Clearly, these buses can transfer partial sums, as in 18:20-60, where
it is noted that partial sums can be moved along the chain of processors)

-Wherein each sample manager is operable to calculate partial sums for a
corresponding portion of the rendered samples located within a convolution 
kernel corresponding to a pixel location, wherein each sample comprises values 
for a plurality of parameters, wherein partial sums comprise partial sums for each 
sample parameter value, wherein the partial sums comprise 1) a sum of weights 
determined for locations of the rendered samples in the portion of rendered 
samples and 2) a sum of weighted sample values for the portion of rendered 
samples, (Wilson is designed to process images in 8x8 bit submatrices, as in 
Abstract, and 2:45-3:4. Further, in 1:10-35 Wilson teaches that inherently, data
arrays such as images must be broken into smaller data array sizes with
dimensions equivalent to the size of the processor array. Therefore, the recited
'convolution kernel' in applicant's specification consists of a certain N x N region, as
noted in Remarks page 1, of which the 8x8 sub-matrix or sub-array of Wilson
would clearly qualify. Clearly, the resultant element is passed down the
processor line to be operated upon, which provides a sum of weighted values for
that portion of samples. Further, in 18:20-60 it is clearly explained that the
system is intended to handle convolutions and/or sums as part of processing 
images, where these can clearly be partial sums. Wilson very clearly teaches 
many common tasks in image filtering, such as transposes (16:40-50), 
accumulation (19:20-60), and the like (16:53-18:20, 18:60-19:50))

-Wherein each of the second through the last sample manager in the series is
operable to add the partial sums calculated for its corresponding portion of the
rendered samples to any previously accumulated partial sums received from the
prior sample manager in the series, and if not the last sample manager in the
series, output new accumulated partial sums to the next sample manager in the 
series. (Wilson clearly teaches that for the accumulator model, each group of 
processing elements passes the partial sums along towards the right. Clearly, 
the system will take the partial sums calculated for its portion of the same sample 
and output the results to the next group of processing elements in the series) 
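For illustration only, the accumulation recited in the limitations above can be sketched as follows. This is a minimal sketch under stated assumptions, not applicant's or Wilson's implementation: the function names are hypothetical, a tent filter is assumed for the weights, and each sample is assumed to carry three parameter values. Each sample manager computes a sum of weights and a sum of weighted values for its own portion of the samples, adds them to the sums received from the prior manager, and passes the result on.

```python
def tent_weight(dx, dy, radius):
    """Assumed filter: weight falls off linearly with distance, zero beyond radius."""
    distance = (dx * dx + dy * dy) ** 0.5
    return max(0.0, 1.0 - distance / radius)


def manager_partial_sums(samples, pixel, radius, num_params=3):
    """Partial sums for one manager: (sum of weights, weighted sum per parameter)."""
    px, py = pixel
    weight_sum = 0.0
    value_sums = [0.0] * num_params
    for x, y, values in samples:
        w = tent_weight(x - px, y - py, radius)
        weight_sum += w
        for i, v in enumerate(values):
            value_sums[i] += w * v
    return weight_sum, value_sums


def accumulate_chain(portions, pixel, radius, num_params=3):
    """Pass accumulated partial sums from the first manager to the last."""
    acc_weight = 0.0
    acc_values = [0.0] * num_params
    for samples in portions:  # one portion per sample manager, in series order
        w, vals = manager_partial_sums(samples, pixel, radius, num_params)
        acc_weight += w
        acc_values = [a + b for a, b in zip(acc_values, vals)]
    return acc_weight, acc_values


# Two hypothetical managers, one sample each, filtered about pixel (0, 0).
portions = [
    [(0.0, 0.0, (1.0, 0.0, 0.0))],
    [(1.0, 0.0, (0.0, 1.0, 0.0))],
]
total_weight, total_values = accumulate_chain(portions, (0.0, 0.0), 2.0)
```

The output of the last manager in the chain holds the final accumulated partial sums for the pixel location.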

Wilson fails to teach several of the limitations. Firstly, Wilson does not 
clearly teach distributed convolution of rendered samples with image kernels; 
that is, Wilson does not expressly say that convolution is happening on the per- 
pixel basis that is "distributed convolution." Bui teaches image convolution - 
1:10-2:30, where the process of convolution, dividing an image into smaller 
kernels and performing filtering is explained; see 3:10-35 where a general- 
purpose digital convolver of the present invention is explained. 

Next, Wilson fails to expressly teach how the plurality of sample managers 
would communicate the accumulated partial sums between the processor arrays 
within each block per se and how the various partial sums would be
transmitted between the processor arrays that are in series in Wilson. Bui 
remedies this deficiency. Bui Figure 2 teaches a plurality of sample managers, 
e.g. filter elements in rows, that constitute parallel sums, which are similar to 
those found in Wilson as above, where each element generates a result and 
passes it down the chain - see the multipliers and adders, where each multiplier 
has a preassigned coefficient value - see 3:50-65 - e.g. one of the coefficients of 



Application/Control Number: 10/673,087 Page 11 

Art Unit: 2628 

the 3x3 convolution kernel - e.g. the filter. Bui - clearly the connections between 
each element constitute a sample bus - the original data enters the pipeline and 
is weighted, then is subjected to a delay and placed on the partial sum bus and 
passed down the addition nodes on the accumulator lines. 

Wilson further fails to expressly teach the necessary convolution kernel 
with respect to the image subpart; that is with respect to the weights being 
transmitted between blocks and the like. Bui - Separate regions of the images 
are convolved - 4:34-40 - every pixel in the image is subjected to the 
convolution kernel and serves as the center point in it before processing is 
completed. Bui 2:33-50 provides that video pixel elements that fall within the 
convolution kernel are firstly weighted using multipliers having preassigned 
coefficients, which are then added together. See for example Figure 2 - each 
pixel is firstly sent through a multiplier M02 where it is weighted, then along a 
sample bus 2B through a delay element to an adder A01. This continues down
the chain, where each weighted element is added to the bus line and passed 
down. In other words, the coefficients assigned to the multipliers comprise a sum 
of weights for the various elements, and at the end of each branch, a partial sum 
is output. Then, that partial sum from that row is passed along another partial 
sum bus to another delay element H, where the master partial sum bus then 
adds the row output partial sums together and a final value is output as 
convolved video output 210. 

Finally, Wilson fails to teach that the final output is the desired video, that
is, that the final summation is the desired final accumulation of partial sums. Bui
Figures 2-4 clearly show that the partial sums produced by any element that is
not the first in a chain are passed down the partial sums bus, with the output of
that particular element added to the result already in the partial sums bus, and
that result is passed down the chain. The last element in the chain outputs the 
convolved video output. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the systems of Wilson and Bui because the 
system of Bui introduces fewer delays and can operate in a faster manner for
purposes of convolution of specific elements - see for example 5:1-20, which
avoids the known larger delays in the prior art, such as 4:46-57. While
the system of Wilson (2:45-60) is specified to perform image processing and
convolution, as well as accumulation operations, the Wilson reference is silent on
how it actually implements image filtering. The Bui reference clearly performs
image filtering using weighted partial sums buses as defined above, where in
1:63-2:2 it is specified how accumulation processes are used in image
convolution.

Bui and Wilson both fail to expressly teach that each sample comprises 
values for a plurality of parameters. However, it was well known at the time the 
references were made to use color video, and examiner submits that the very 
passage in Bui cited by applicant as not suggesting color video actually teaches 
the opposite (see page 4 of Remarks). That is, if the video were monochrome or
grayscale, it is highly unlikely that non-correlated noise would be referred to as
'white'. Examiner further submits that if the
video were regarded as having different components, handling each set of partial 
sums separately (e.g. keeping the partial sums for different colors) would have 
been obvious in light of that. 

Further, the Garlick reference teaches that pixels contain red, green,
and blue (and optionally alpha) components (1:5-2:25). Therefore, each sample
(pixel) is described as having a plurality of parameters (e.g. colors). As such, the
system of Garlick handles each color component separately and provides partial
sums for each color component to a set of output adders, where (14:24-40) each
set of partial sums for each color component is handled separately (see Figure
4B, where adders 414A, 414B, and 414C all have three separate pieces for
handling each color component).

For at least the reasons set forth in the Abstract and in 1:4-3:28, having
color pixels and thusly using separate buses for each would have been obvious;
one of ordinary skill in the art at the time the invention was made would have
been motivated to make the above modifications to Bui and Wilson for at least 
the reasons found within the cited passages. 

As to claim 2, Wilson/Bui do not calculate normalized values,
whereas Garlick clearly divides the combined values by a factor to obtain the 
average (e.g. normalized) value (14:25-40), where the previous elements are 
added together and then the final version is divided by a factor of four. 
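For illustration only, the averaging step described above can be sketched as follows. This is a minimal sketch assuming four subsamples per pixel and three color components, as the passage reads Garlick 14:25-40; the function name is hypothetical and the sketch is not Garlick's hardware.

```python
def average_subsamples(subsamples):
    """Add each color component across the subsamples, then divide by their count."""
    count = len(subsamples)
    sums = [0, 0, 0]
    for red, green, blue in subsamples:
        sums[0] += red
        sums[1] += green
        sums[2] += blue
    return tuple(s / count for s in sums)


# Four hypothetical RGB subsamples belonging to one pixel.
pixel = average_subsamples([(4, 8, 12), (0, 0, 0), (4, 8, 12), (0, 0, 0)])
```

With four subsamples the division is by a factor of four, which corresponds to the normalized (average) value referred to above.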

As to claim 42, Wilson/Bui do not teach multi-sampling, whereas Garlick
teaches that multi-sampling is useful and beneficial (1:5-3:25), and motivation is 
taken from the rejection to the parent claim. 

As to claim 43, see the rejection to claim 42 above. Garlick teaches multi- 
sampling, where the term is well known in the computer art to mean an optimized 
version of super-sampling, wherein the hardware is actually made and/or aware 
of the various virtual pixels or subpixels and has wider data paths in order to 
handle it. Again, it is well known to merely be a hardware-optimized super- 
sampling, since super-sampling merely involves taking more than one data point 
within a pixel, which multi-sampling clearly does. 

As to claim 14, 

Wilson partially teaches the limitations below (see below sections for 
explanations as to where it fails). 

A system for distributed filtering of samples comprising: (Wilson is designed
to process images in 8x8 bit submatrices, as in
Abstract, and 2:45-3:4, where an image inherently consists of pixels. Wilson
performs convolution (18:20-60) on sections of an image broken down into
smaller data pieces (1:5-20), utilizing a plurality of processors 10a-10n, which
therefore constitutes 'distributed processing')

-A series of N sample managers (k), wherein k is an integer with range 0 to N-1,
and wherein N is an integer greater than 1; (Wilson clearly teaches in Abstract
and in Figure 1 a plurality of processor groups each comprising individual
processing elements 10a-10h, wherein each group of elements is connected to
the element adjacent to it using data lines 11i-11n and register shift lines 21i-21n,
as set forth in 5:55-6:34; these elements are clearly connected in series (as in
2:45-60, where it states that these elements are connected in a linear chain,
where in Figure 1 it is clear that data is input from data input device 20 over lines 21a
and then passed in a one-way manner down the line on lines 21i→21n))

-A partial sums bus connecting each sample manager (k) in the series of sample
managers to the next sample manager (k+1); (Wilson - the term bus is well
known in the art to merely mean one or more data transfer lines, wherein the
data lines 11i-11n and 21i-21n clearly move data in byte-size chunks, which
therefore means that they are "buses" in the sense meant by applicant. Clearly,
these buses can transfer partial sums, as in 18:20-60, where it is noted that
partial sums can be moved along the chain of processors)

-Receive accumulated partial sums from a prior sample manager (k-1), if k is
greater than zero, wherein each sample comprises values for a plurality of
parameters, wherein partial sums comprise partial sums for each parameter
value, (Wilson is designed to process images in 8x8 bit submatrices, as in
Abstract, and 2:45-3:4. Further, in 1:10-35 Wilson teaches that inherently, data
arrays such as images must be broken into smaller data array sizes with
dimensions equivalent to the size of the processor array. Therefore, the recited
'convolution kernel' in applicant's specification consists of a certain N x N region, as
noted in Remarks page 1, of which the 8x8 sub-matrix or sub-array of Wilson
would clearly qualify. Clearly, the resultant element is passed down the
processor line to be operated upon, which provides a sum of weighted values for
that portion of samples. Further, in 18:20-60 it is clearly explained that the
system is intended to handle convolutions and/or sums as part of processing
images, where these can clearly be partial sums. Wilson very clearly teaches
many common tasks in image filtering, such as transposes (16:40-50),
accumulation (19:20-60), and the like (16:53-18:20, 18:60-19:50))

-Calculate partial sums for a set of samples, wherein the set of samples are 
within a sub-set of screen space assigned to sample manager (k), and wherein 
the set of samples are located within a convolution kernel defined for a pixel, 
(Wilson, as discussed above, divides the screen into 8x8 submatrices for 
processing, which constitute a sub-set of screen space, which would be assigned 
to each block, as discussed above also, where the block would constitute a 
sample manager (k). Clearly, these are used for accumulation purposes) 

-Add the partial sums to the sample manager (k+1), if k is less than N-1; and
(Plainly meaning that if sample manager (k) is the last one in the series (e.g. k = N-1),
no partial sum addition would take place, since there would be no more
sample managers to send the partial sums to - the addition process would be
complete) (Wilson clearly teaches that for the accumulator model, each group of
processing elements passes the partial sums along towards the right. Clearly,
the system will take the partial sums calculated for its portion of the same sample
and output the results to the next group of processing elements in the series) (Bui
Figures 2-4 clearly show that the partial sums produced by any element that is
not the first in a chain are passed down the partial sums bus, with the output of
that particular element added to the result already in the partial sums bus, and
that result is passed down the chain. The last element in the chain outputs the
convolved video output).

-Wherein a designated sample manager is operable to calculate pixel values 
from the final accumulated partial sums (Wilson displays the results of the 
computations on an output display device, as shown as data output device 22 in 
Figure 1)

Wilson fails to teach several of the limitations. Firstly, Wilson does not 
clearly teach distributed convolution of rendered samples with image kernels; 
that is, Wilson does not expressly say that convolution is happening on the per- 
pixel basis that is "distributed convolution" and that it involves an image per se. 
Bui teaches image convolution - 1:10-2:30, where the process of convolution,
dividing an image into smaller kernels and performing filtering is explained; see 
3:10-35 where a general-purpose digital convolver of the present invention is 
explained. Clearly, image pixels constitute samples. 

Next, Wilson fails to expressly teach how the plurality of sample managers 
would communicate the accumulated partial sums between the processor arrays 
within each block per se and how the various partial sums would be
transmitted between the processor arrays that are in series in Wilson. Bui 
remedies this deficiency. Bui Figure 2 teaches a plurality of sample managers, 
e.g. filter elements in rows, that constitute parallel sums, which are similar to 
those found in Wilson as above, where each element generates a result and 
passes it down the chain - see the multipliers and adders, where each multiplier 
has a preassigned coefficient value - see 3:50-65 - e.g. one of the coefficients of
the 3x3 convolution kernel - e.g. the filter. Bui - clearly the connections between 
each element constitute a sample bus - the original data enters the pipeline and 
is weighted, then is subjected to a delay and placed on the partial sum bus and 
passed down the addition nodes on the accumulator lines. 

Wilson further fails to expressly teach the necessary convolution kernel 
with respect to the image subpart; that is with respect to the weights being 
transmitted between blocks and the like. Bui - Separate regions of the images 
are convolved - 4:34-40 - every pixel in the image is subjected to the 
convolution kernel and serves as the center point in it before processing is 
completed. Bui clearly teaches that a convolution kernel can be any width - the 
implementation provided of a 3x3 array is merely an example (3:10-45). Bui 
2:33-50 provides that video pixel elements that fall within the convolution kernel 
are firstly weighted using multipliers having preassigned coefficients, which are 
then added together. 

See for example Figure 2 - each pixel is firstly sent through a multiplier 
M02 where it is weighted, then along a sample bus 2B through a delay element 
to an adder A01. This continues down the chain, where each weighted element
is added to the bus line and passed down. In other words, the coefficients 
assigned to the multipliers comprise a sum of weights for the various elements, 
and at the end of each branch, a partial sum is output. Then, that partial sum 
from that row is passed along another partial sum bus to another delay element 
H, where the master partial sum bus then adds the row output partial sums 
together and a final value is output as convolved video output 210. Therefore,
the subset of screen space would be the 8x8 block of Wilson, and the 
convolution kernel provided by Bui could be applied, since the set of samples is 
not required to be the same size as the set of screen space, although for 
purposes of convenience, it could be. 
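For illustration only, the relationship between a subset of screen space and a convolution kernel can be sketched as follows. This is a minimal sketch under stated assumptions: 8x8 blocks of integer pixel positions stand in for the subsets of screen space, a square kernel footprint is assumed, and the function names are hypothetical. It shows that a kernel footprint need not coincide with a single block and may draw on positions assigned to several blocks.

```python
from collections import defaultdict

BLOCK = 8  # assumed block edge length, matching the 8x8 sub-matrix discussed above


def block_of(x, y, block=BLOCK):
    """Index of the block (subset of screen space) that owns position (x, y)."""
    return (x // block, y // block)


def kernel_positions_by_block(center, half_width, block=BLOCK):
    """Group the positions under a square kernel footprint by owning block."""
    cx, cy = center
    groups = defaultdict(list)
    for y in range(cy - half_width, cy + half_width + 1):
        for x in range(cx - half_width, cx + half_width + 1):
            groups[block_of(x, y, block)].append((x, y))
    return dict(groups)


# A 3x3 footprint centred on a block corner touches four different blocks.
groups = kernel_positions_by_block((8, 7), 1)
```

In this arrangement each block would contribute a partial sum for only the positions it owns, and the contributions would then be accumulated across blocks.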

Finally, Wilson fails to teach that the final output is the desired video, 
which is the final summation that is the desired final accumulation of partial 
sums. Bui Figures 2-4 clearly show that the partial sums produced by any
element that is not the first in a chain are passed down the partial sums bus, with
the output of that particular element added to the result already in the partial 
sums bus, and that result is passed down the chain. The last element in the 
chain outputs the convolved video output. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the systems of Wilson and Bui because the 
system of Bui introduces fewer delays and can operate in a faster manner for
purposes of convolution of specific elements - see for example 5:1-20, which
avoids the known larger delays in the prior art, such as 4:46-57. While
the system of Wilson (2:45-60) is specified to perform image processing and
convolution, as well as accumulation operations, the Wilson reference is silent on
how it actually implements image filtering. The Bui reference clearly performs
image filtering using weighted partial sums buses as defined above, where in
1:63-2:2 it is specified how accumulation processes are used in image
convolution.

Bui and Wilson both fail to expressly teach that each sample comprises
values for a plurality of parameters. However, it was well known at the time the 
references were made to use color video, and examiner submits that the very 
passage in Bui cited by applicant as not suggesting color video actually teaches 
the opposite (see page 4 of Remarks). That is, if the video were monochrome or
grayscale, it is highly unlikely that non-correlated noise would be referred to as
'white'. Examiner further submits that if the
video were regarded as having different components, handling each set of partial 
sums separately (e.g. keeping the partial sums for different colors) would have 
been obvious in light of that. 

Further, the Garlick reference teaches that pixels contain red, green,
and blue (and optionally alpha) components (1:5-2:25). Therefore, each sample
(pixel) is described as having a plurality of parameters (e.g. colors). As such, the
system of Garlick handles each color component separately and provides partial
sums for each color component to a set of output adders, where (14:24-40) each
set of partial sums for each color component is handled separately (see Figure
4B, where adders 414A, 414B, and 414C all have three separate pieces for
handling each color component).

For at least the reasons set forth in the Abstract and in 1:4-3:28, having
color pixels and thusly using separate buses for each would have been obvious;
one of ordinary skill in the art at the time the invention was made would have
been motivated to make the above modifications to Bui and Wilson for at least
the reasons found within the cited passages.

As to claim 15, clearly each group of elements in Wilson has the
corresponding memory across to each group of processing elements in series - 
see Figure 1 as an example. 

As to claims 16 and 39, clearly the only improvement is providing a 
plurality of memory units, where this is a simple duplication of parts. There is no 
claimed benefit to doing so - In re Harza, 274 F.2d 669, 124 USPQ 378 (CCPA 
1960). Therefore, it would have been obvious to one of ordinary skill in the art to 
use plural memories for the reasons cited therein. 

As to claim 17, it would have been obvious that if each sample manager 
has a memory dedicated to storing sample data only for samples in the subset of 
screen space assigned to it that the sample manager would be able to read from 
the memory it writes to. Examiner takes Official Notice of the fact that it is well 
known in the art that a memory that is written to by a computer is also readable, 
and that for the sample manager to perform calculations, it must be able to write 
new data values to the memory once the calculations have taken place. The 
motivation to do so would be that without functional memory a processor would 
not work. 

Claims 3, 21, 25-27, 30, 32-33, and 35-37 are rejected under 35 
U.S.C. 103(a) as being unpatentable over Wilson in view of Bui and Garlick as 
applied to claims 1 and 2 above, and further in view of Inada et al (US 
2004/0004620 A1) ('Inada'). 

As to claim 3, the system of claim 1, wherein for each sample manager 
the corresponding portion of samples resides in a sub-set of screen space and 
the sub-sets are finely interleaved across screen space. Clearly the system of 
Inada establishes in [0154] that the system breaks the screen down into blocks of 
4x4 pixels for interleaving, which constitutes a distribution across screen space, 
and further in Fig. 1 it is shown how the screen is divided into smaller areas, 
where each area is analyzed for the presence of a primitive in the pixels in that 
particular, smaller area. Further, Wilson teaches the division of an image into 
blocks or sub-arrays for processing and convolution purposes. That being said, 
it would have been obvious to one having ordinary skill in the art at the time the 
invention was made to combine the systems of Wilson and Bui/Garlick for the 
reasons set forth above (the motivation and combination of claim 1 are herein 
incorporated by reference) with the system of Inada, to allow interleaving as that 
technique speeds up drawing time (Inada [0155]). 
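
As an illustration only, block interleaving of the kind described above can be sketched as follows. The 2x2 arrangement of processing units and the function name owner_of_pixel are assumptions made for this sketch; neither is taken from Inada or Wilson. Only the 4x4 block size comes from the passage cited above.

```python
# Illustrative sketch only: the unit layout and names are hypothetical.
BLOCK = 4  # block edge in pixels (4x4 blocks, as in the cited passage)

def owner_of_pixel(x: int, y: int, units_x: int = 2, units_y: int = 2) -> int:
    """Return the index of the processing unit that owns pixel (x, y).

    The screen is cut into BLOCK x BLOCK blocks, and the blocks are dealt
    out to the units in a repeating units_x by units_y pattern, so that
    neighboring blocks belong to different units.
    """
    block_x, block_y = x // BLOCK, y // BLOCK
    return (block_y % units_y) * units_x + (block_x % units_x)

# Owners of the first pixel of each block along the top row of the screen.
top_row_owners = [owner_of_pixel(x, 0) for x in range(0, 16, BLOCK)]
```

With four units the pattern repeats every two blocks in each direction, so each unit's share of the screen is spread finely across screen space rather than being one contiguous region.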

As to claim 21, the rejection to claim 1 is incorporated by reference. 

Wilson teaches the claimed filter unit, which as recited is clearly comparable to 
the sample managers recited in previous claims, as the functionality is the same, 
and the processing element would clearly be performing similar tasks. Each of 
the N memories recited is attached to a group of processing elements, which 
serves as a filtering element / sample manager / generic processor, as set forth 
in Wilson Fig. 1. Each unit of Wilson could contain a convolution kernel, with the 
individual taps of the filter being the multipliers of Bui and the like. Wilson does 
not expressly teach that the individual filter taps are elements within the blocks, 
but Bui does. 

As set forth in the preceding paragraph, each processor 10a-10h in the 
processing group in Wilson obviously reads from its own memory that contains 
the section of the image assigned to it (as in Inada) for convolution purposes (or 
Wilson), performs partial sum calculations on it, and moves it into the next 
element in the linearly connected array (Wilson). The entire question of partial 
sums and their calculations is covered in the sections of the rejections of claim 1 
that have been expressly incorporated via reference and will not be repeated for 
the purposes of brevity. 

Next, the recited N must clearly be at least 2 or the set of N-1 partial sum 
buses would be an empty set and only one processor would exist; further, the 
last clause refers to the last filter unit (N-1), where with a filter unit number zero, if 
the condition (N = k = 1) held, then the condition on the last line of the filter unit 
would not hold (e.g. k = 1 > (N-1) = 0), which would require a second filter unit. 
That is to say, the entire manner in which the claim is written requires that N >= 2, 
since a sequence by definition requires two or more members (e.g. the following 
of one thing after another, etc). 

Obviously, the system of Wilson has one or more memory per processing 
group as connected in Figure 1 - see Wilson 5:55-6:45. Each unit of Bui is an 
individual tap of the filter or similar implementation. The recited numeric 
limitations - that of N and k, are obvious in that any chain of processors would 
have each processor numbered as set forth in the claim, with the respective 
limitations, in that, for example, the first processor in the chain would be 
numbered zero, and of course the first processor would prima facie not receive 
data from a previous processor as it was the first one in the processor chain. 
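
As an illustration only, the numbering discussed above (units 0 through N-1, with N-1 partial sum buses between them) can be sketched as below. The function name chain_filter and the data layout are hypothetical; this is not the circuit of Wilson or Bui, only a sketch of a serial chain in which unit 0 receives nothing from a predecessor and unit N-1 produces the final value.

```python
# Illustrative sketch only: names and data layout are hypothetical.
from typing import Sequence

def chain_filter(portions: Sequence[Sequence[float]],
                 weights: Sequence[Sequence[float]]) -> float:
    """Pass a running partial sum through a chain of N filter units.

    portions[k] holds the samples owned by unit k and weights[k] the
    matching filter weights. Unit 0 starts the sum; every later unit adds
    its own weighted sum to the value arriving on the bus from unit k-1.
    """
    n = len(portions)
    if n < 2:
        # With a single unit there would be no partial sum bus at all.
        raise ValueError("a chain needs at least two units (N >= 2)")
    bus = 0.0
    for k in range(n):
        local = sum(w * s for w, s in zip(weights[k], portions[k]))
        bus = local if k == 0 else bus + local
    return bus  # value leaving the last unit (k = N-1)

result = chain_filter([[1.0, 2.0], [3.0]], [[0.5, 0.5], [1.0]])
```

In this sketch the value carried between consecutive units plays the role of one partial sum bus, so a chain of N units uses N-1 of them.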

Wilson and Bui fail to teach that the recited system breaks the screen 
down into blocks of screen space with a distribution. Clearly the system of Inada 
establishes in [0154] that the system breaks the screen down into blocks of 4x4 
pixels for interleaving, which constitutes a distribution across screen space, and 
further in Fig. 1 it is shown how the screen is divided into smaller areas, where 
each area is analyzed for the presence of a primitive in the pixels in that 
particular, smaller area, and in Fig. 7 the screen is shown to be divided into 2x2 
bins or tiles containing samples for processing purposes. Further, Wilson teaches 
the division of an image into smaller arrays, as Wilson is designed to process 
images in 8x8 bit submatrices, as in Abstract, and 2:45-3:4. Further, in 1:10-35 
Wilson teaches that inherently, data arrays such as images must be broken into 
smaller data array sizes with dimensions equivalent to the size of the processor 
array. 

One additional note is that the system of Inada as shown in Fig. 4 clearly 
shows a plethora of operations units attached to each register (e.g. operation 
units 1411-1, 1412-1, etc. attached to registers such as 1411-2), which clearly 
establishes multiple processing / operations units attached to memories in the 
first place. 

Obviously, the system of Inada outputs pixels as set forth in paragraphs 
[0022-0024]. Clearly, the results of all the graphics calculations and convolutions 
would be passed out as pixel data as shown there, and it is logical that the end 
results of an image convolution calculation from a chain would indeed be output 
as pixels - indeed, an image is fundamentally composed of pixels, and it is a 
fundamental of the digital signal processing art that an image is output in pixels 
from being processed in this context. 

Motivation and combination is taken from the rejection to claim 1, which is 
incorporated by reference, and from the additional logic as set forth above. 
Inada brings in the benefits of explaining how the screen space is subdivided so 
that such an array can thusly more efficiently process all the information provided 
from the subdivision of the screen space, as set forth in the cited paragraphs. 
Motivation / rationale is also taken from the rejection to claim 3 above. 

As to claim 25, it is merely a method implementing the system of claim 21, 
and the rejection to claim 21 is valid upon it without further comment. 

As to claim 26, see the rejection to claim 3 above, which addresses 
regions having defined boundaries, wherein the screen space is divided as set 
forth there. Wilson and Bui fail to teach this limitation. Inada [0020] discusses 
how each region is judged with respect to its center point, which clearly 
establishes that this is an obvious variation. Motivation and combination are 
taken from the parent claim and incorporated herein by reference in their entirety. 

As to claim 27, this limitation is expressly covered in the rejection of claim 
1, the relevant portion of which is incorporated by reference, and is also stated 
below. Wilson is designed to process images in 8x8 bit submatrices, as in 
Abstract, and 2:45-3:4. Further, in 1:10-35 Wilson teaches that inherently, data 
arrays such as images must be broken into smaller data array sizes with 
dimensions equivalent to the size of the processor array. Therefore, the recited 
'convolution kernel' in applicant's specification consists of a certain N x N region, as 
noted in Remarks page 1, of which the 8x8 sub-matrix or sub-array of Wilson 
would clearly qualify. Clearly, the resultant element is passed down the 
processor line to be operated upon, which provides a sum of weighted values for 
that portion of samples. Further, in 18:20-60 it is clearly explained that the 
system is intended to handle convolutions and/or sums as part of processing 
images, where these can clearly be partial sums. Wilson very clearly teaches 
many common tasks in image filtering, such as transposes (16:40-50), 
accumulation (19:20-60), and the like (16:53-18:20, 18:60-19:50). 

However, Wilson fails to expressly teach that data is passed down the 
processor chain as partial sums data in the process required for filtering in the 
context of the instant application. Bui clearly teaches that the data is passed 
down the adder / processing element chain and that it consists of partial sums as 
data in Figure 3 and 5:65-6:20. Bui is a filter, where this kind of filter inherently 
consists of weight functions that are 'partial sums', where each tap in such a filter 
very clearly causes a partial sum as the outcome, see for example the weights 
on the multipliers in Figures 1-4. Bui performs accumulation, as discussed in 
1:65-2:10. Motivation and combination is taken from the rejection to claim 25 as 
above. 

As to claim 30, this is an obvious variation and is addressed in the 
rejection to claim 21 and is repeated herein. Obviously, the system of Inada 
outputs pixels as set forth in paragraphs [0022-0024]. Clearly, the results of all 
the graphics calculations and convolutions would be passed out as pixel data as 
shown there, and it is logical that the end results of an image convolution 
calculation from a chain would indeed be output as pixels - indeed, an image is 
fundamentally composed of pixels, and it is a fundamental of the digital signal 
processing art that an image is output in pixels from being processed in this 
context. Motivation and combination are incorporated by reference from the 
parent claim. 

As to claim 32, Wilson and Bui fail to address and Inada clearly addresses 
this limitation wherein it would be obvious to divide the screen up into bins and 
assign each one to an FPGA or processing element or filtering element, 
whatever the generic terminology for the groups of Wilson or the elements of Bui. 

As to claim 33, Wilson and Bui fail to teach this limitation, where clearly 
the system of Inada establishes in [0154] that the system breaks the screen 
down into blocks of 4x4 pixels for interleaving, which constitutes a distribution 
across screen space, and further in Fig. 1 it is shown how the screen is divided 
into smaller areas, where each area is analyzed for the presence of a primitive in 
the pixels in that particular, smaller area. Further, Wilson teaches the division of 
an image into blocks for processing and convolution purposes. That being said, 
it would have been obvious to one having ordinary skill in the art at the time the 
invention was made to combine the systems of Wilson and Bui for the reasons 
set forth above (the motivation and combination of claim 2 are herein 
incorporated by reference) with the system of Inada, to allow interleaving as that 
technique speeds up drawing time (Inada [0155]). 

As to claim 35, clearly the system of Wilson has filter units divided into a 
plurality of smaller processing units, where this is shown in Figure 1, with the 8 
processor configuration as described there for handling 8x8 arrays such as those 
shown in Figures 6A and 6B. Wilson does not expressly teach that these 
constitute a filter, but the system of Bui clearly provides for filtering as explained 
in the rejection to claim 25, which is incorporated by reference. 

As to claim 36, this is a trivially obvious variant of claim 30 and is subject to the 
same rejection. 

As to claim 37, this is a trivially obvious variant of claim 33 and is subject 
to the same rejection. 

Claims 4 and 34 are rejected under 35 U.S.C. 103(a) as unpatentable 
over Wilson, Bui, Garlick, and Inada as applied to claim 3 above, and further in 
view of Hsieh et al (US 6,819,321 B1). 

As to claim 4, Wilson and Bui do not expressly teach these limitations; 
Inada clearly teaches dividing the screen into a plurality of bins but does not 
specifically teach sixteen sample managers and a four by four array of bins. 
Reference Hsieh et al teaches dividing the screen into a number of bins for two- 
dimensional image processing, where the number can be arbitrary, but where an 
example given is four bins in Figure 4 (3:28-40). Applicant has not established 
any criticality to the number of sample managers and/or bins, and as such the 
choice of an arbitrary number of 'sample managers' or processing elements and 
the division of the screen into some arbitrary number of bins is a matter of design 
choice, see In re Harza, 274 F.2d 669, 124 USPQ 378 (CCPA 1960). It would 
have been obvious to one of ordinary skill in the art at the time the invention was 
made to modify Inada, Wilson, and Bui to use arbitrarily scaled bins as per Hsieh, 
since Hsieh decreases memory bandwidth required and provides numerous 
other benefits (see 2:20-40 and Abstract). 

As to claim 34, it is identical to claim 4, the rejection to which is 
incorporated by reference. 

Claims 5, 7, and 37 are rejected under 35 U.S.C. 103(a) as unpatentable 
in view of Wilson and Bui/Garlick as applied to claim 1, and further in view of 
Cloutier (5,892,962 A). 

As to claim 5, Wilson and Bui do not expressly teach this limitation. 
Cloutier clearly teaches a plurality of groups of FPGAs in Figure 1, with these 
controllable by the SIMD process controller. Cloutier Fig. 1 clearly illustrates a 
plurality of FPGAs configured in a matrix connection, all with global bus 
connections, and Fig. 3 illustrates similar connections between PEs on one 
FPGA. It is notoriously well known that an FPGA can be configured to emulate 
any other type of processor, e.g. the system of Wilson/Bui as noted above. 
Cloutier teaches that such a system works quickly and is more efficient for 
processing images and the like. Cloutier clearly establishes, as set forth 
above, that each PE performs convolution based on weighted partial sum 
operations. Furthermore, the nature of convolution is such that once partial sums 
are computed, they must be acted on by other elements or processors to 
produce the final, desired results. 

As such, examiner takes the position that explanations cited above in 
response to each element of the portion of the claim dealing with the computation 
and/or calculation of partial sums more than adequately meet all of the 
limitations set forth by that section of the claim. Further, any PE that received 
partial sums from the bus would clearly add them using the multiply-accumulate 
operations cited above, particularly in the case of a neural network that was 
being used to perform convolution, which would be obvious to do since the 
system of Cloutier clearly has established utility for performing both tasks, and 
optical character recognition (OCR), which requires convolution and pre- 
processing. Further, Cloutier clearly teaches the applicability of his system to 
image processing in 4:20-35. Cloutier 7:26-55 again, where it is well known that 
the partial sums must be added, and 4:20-35, where it is taught that the present 
embodiment is well suited for matrix and vector addition and multiplication. More 
specifically, the embodiment of Cloutier is taught to perform multiply-accumulate 
operations (8:50-9:15), which clearly requires that the network of processing 
elements perform multiply-accumulate operations per tile (with each processing 
element performing said operations), and in a neural network application, which 
provides feed-forward information (e.g. feedback) for pattern recognition and 
similar, multiply-accumulate operations used in the processing of an image would 
obviously be added and passed along, as the architecture of Cloutier as shown in 
Fig. 1 is such that data is passed along between elements in the additive fashion 
as set forth above. Also, the system performs convolution and uses partial sums, 
which clearly requires that the accumulated partial sums be passed to other 
elements. Cloutier clearly establishes in 7:26-55 that the system has many 
processing elements, each of which computes its own partial sums for convolution 
purposes. 

In light of all of the above, it would have been obvious to one of ordinary 
skill in the art at the time the invention was made to combine the system of 
Wilson/Bui with that of Cloutier, such that each FPGA emulated the system of 
Wilson/Bui and therefore allowed each section to handle one portion of screen 
space, as this parallel computation would inherently be faster since the parts 
would be duplicated and the net speed would increase. This is notoriously well 
known in the art. 

As to claim 7, this is a duplicate of claim 2, the rejection to which is 
incorporated by reference. 

As to claim 37, this is a duplicate of claim 5, the rejection to which is 
incorporated by reference in addition to the rejection to claim 37 above. 

Claim 6 is rejected under 35 U.S.C. 103(a) as unpatentable over 
Wilson/Bui/Garlick in view of Cloutier as applied to claim 5 above, and further in 
view of Hsieh. 

Therefore, the rejections of claims 2 and 5 are incorporated by reference, 
and motivation and rationale are taken from each of them. 

As to claim 6, all of the processing elements in Fig. 1 and Fig. 3 of 
Cloutier are clearly connected via the global bus anyway, which clearly meets the 
requirements that all the sample managers be chained, and that the final member 
calculates pixel values - clearly each chip is computing pixel values from the 
results of the prior one - see Cloutier 7:7-67. Further, as explained above in the 
rejection to claim 1, the FPGAs and the PEs within each FPGA are all 
interconnected, and the PEs and the FPGAs can clearly be connected in a chain 
or serial fashion, as the very nature of an FPGA is that the blocks can be set to 
have any desired set of connections with bidirectional or unidirectional 
communications. 

Reference Hsieh et al teaches dividing the screen into a number of bins 
for two-dimensional image processing, where the number can be arbitrary, but 
where an example given is four bins in Figure 4 (3:28-40). Applicant has not 
established any criticality to the number of sample managers and/or bins, and as 
such the choice of an arbitrary number of 'sample managers' or processing 
elements and the division of the screen into some arbitrary number of bins is a 
matter of design choice, see In re Harza, 274 F.2d 669, 124 USPQ 378 (CCPA 
1960). It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to modify Inada, Wilson, and Bui to use arbitrarily scaled 
bins as per Hsieh, since Hsieh decreases memory bandwidth required and 
provides numerous other benefits (see 2:20-40 and Abstract). 

References Wilson and Bui do not expressly teach this limitation, whilst 
Reference Cloutier teaches in 7:28-40 specifically that a matrix of 8x4 PEs is 
implemented on each FPGA where there is a 2x2 array of FPGAs in the first 
place, and Inada provides additional support. Given this, it would be reasonable 
to use a 4x4 array instead of an 8x4 array, given that each FPGA could easily be 
partitioned into a 4x4 array, as illustrated in Figure 3 anyway. Now, as set forth 
in the rejections to claims 1-3 above, the system of Cloutier is taught for use with 
image convolution, and further in Inada the use of interlaced scans is taught, 
such that the screen is divided up into units of say 4x4 pixels for faster drawing 
time. Therefore, if one FPGA with a 4x4 array of PEs, or a 2x2 array of FPGAs 
with a 2x2 PE implementation, with each one dedicated to processing a certain 
portion of the screen was used, and interleaving was used for the results, it 
would be logical to use the claimed 4x4 architecture. Motivation and combination is 
taken from the parent claim and herein incorporated by reference, with additional 
motivation as set forth in the immediately preceding paragraph. 

Claim 22 is rejected under 35 U.S.C. 103(a) as unpatentable over Wilson, 
Bui, Garlick, and Inada as applied to claim 21 above, and further in view of Hsieh 
et al (US 6,819,321 B1). 

As to claim 22, references Wilson and Bui do not teach this limitation; 
Inada clearly teaches dividing the screen into a plurality of bins but does not 
specifically teach sixteen sample managers and a four by four array of bins. 
Reference Hsieh et al teaches dividing the screen into a number of bins for two- 
dimensional image processing, where the number can be arbitrary, but where an 
example given is four bins in Figure 4 (3:28-40). Applicant has not established 
any criticality to the number of sample managers and/or bins, and as such the 
choice of an arbitrary number of 'sample managers' or processing elements and 
the division of the screen into some arbitrary number of bins is a matter of design 
choice, see In re Harza, 274 F.2d 669, 124 USPQ 378 (CCPA 1960). It would 
have been obvious to one of ordinary skill in the art at the time the invention was 
made to modify Inada, Wilson, and Bui to use arbitrarily scaled bins as per Hsieh, 
since Hsieh decreases memory bandwidth required and provides numerous 
other benefits (see 2:20-40 and Abstract). 

Claim 28 is rejected under 35 USC 103(a) as unpatentable over Wilson, 
Bui, Garlick, and Inada as applied to claim 27 above, and further in view of Vetro 
et al (US 6,266,443 B1). 

As to claim 28, WBGI fail to teach the use of Gaussian filters or the like. 
Vetro clearly teaches the use of a Gaussian filter type convolution kernel to 
sharpen edges in an image (5:64-67). Bui clearly taught the use of the filter / 
convolution kernel to remove noise (note applicant's quotation of same on page 3 
of Remarks). Therefore, it would have been obvious to one of ordinary skill in the 
art at the time the invention was made to use a Gaussian filter since it allows for 
sharpening of edges within an image, as is well known in the art. 
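
As an illustration only, a Gaussian-weighted convolution kernel of the general kind referred to above can be sketched as below. The radius and sigma values and the function name gaussian_kernel are assumptions made for this sketch and are not taken from Vetro or Bui; the sketch shows only how Gaussian weights are generated and normalized so that they sum to one.

```python
# Illustrative sketch only: radius, sigma, and names are hypothetical.
import math
from typing import List

def gaussian_kernel(radius: int, sigma: float) -> List[float]:
    """Return 2*radius + 1 Gaussian weights, normalized to sum to 1."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma))
           for i in range(-radius, radius + 1)]
    total = sum(raw)
    return [value / total for value in raw]

kernel = gaussian_kernel(radius=2, sigma=1.0)
```

Each weight in such a kernel multiplies one sample, and the weighted values are then accumulated, which is the same multiply and add pattern as the partial sums discussed in the rejections above.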

Claim 38 is rejected under 35 U.S.C. 103(a) as unpatentable over 
Wilson/Bui/Garlick/Inada in view of Cloutier and Hsieh. 

Therefore, the rejections of claims 25 and 5/6 are incorporated by 
reference, and motivation and rationale are taken from each of them. 

WBGI fail to expressly teach the stated arrays and bin configurations. 

As to claim 38, all of the processing elements in Fig. 1 and Fig. 3 of 
Cloutier are clearly connected via the global bus anyway, which clearly meets the 
requirements that all the sample managers be chained, and that the final member 
calculates pixel values - clearly each chip is computing pixel values from the 
results of the prior one - see Cloutier 7:7-67. Further, as explained above in the 
rejection to claim 1, the FPGAs and the PEs within each FPGA are all 
interconnected, and the PEs and the FPGAs can clearly be connected in a chain 
or serial fashion, as the very nature of an FPGA is that the blocks can be set to 
have any desired set of connections with bidirectional or unidirectional 
communications. 

Reference Hsieh et al teaches dividing the screen into a number of bins 
for two-dimensional image processing, where the number can be arbitrary, but 
where an example given is four bins in Figure 4 (3:28-40). Applicant has not 
established any criticality to the number of sample managers and/or bins, and as 
such the choice of an arbitrary number of 'sample managers' or processing 
elements and the division of the screen into some arbitrary number of bins is a 
matter of design choice, see In re Harza, 274 F.2d 669, 124 USPQ 378 (CCPA 
1960). It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to modify Inada, Wilson, and Bui to use arbitrarily scaled 
bins as per Hsieh, since Hsieh decreases memory bandwidth required and 
provides numerous other benefits (see 2:20-40 and Abstract). 

References Wilson and Bui/Garlick/Inada do not expressly teach this 
limitation, whilst Reference Cloutier teaches in 7:28-40 specifically that a matrix 
of 8x4 PEs is implemented on each FPGA where there is a 2x2 array of FPGAs 
in the first place, and Inada provides additional support. Given this, it would be 
reasonable to use a 4x4 array instead of an 8x4 array, given that each FPGA 
could easily be partitioned into a 4x4 array, as illustrated in Figure 3 anyway. 
Now, as set forth in the rejections to claims 1-3 above, the system of Cloutier is 
taught for use with image convolution, and further in Inada the use of interlaced 
scans is taught, such that the screen is divided up into units of say 4x4 pixels for 
faster drawing time. Therefore, if one FPGA with a 4x4 array of PEs, or a 2x2 
array of FPGAs with a 2x2 PE implementation, with each one dedicated to 
processing a certain portion of the screen was used, and interleaving was used 
for the results, it would be logical to use the claimed 4x4 architecture. Motivation 
and combination is taken from the rejections to claim 5 and is herein incorporated 
by reference, with additional motivation as set forth in the immediately preceding 
paragraph. 

Conclusion 

Any inquiry concerning this communication or earlier communications from 
the examiner should be directed to Eric Woods whose telephone number is 
571-272-7775. The examiner can normally be reached on M-F 7:30-5:00. 

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Ulka Chauhan, can be reached on 571-272-7782. The fax 
phone number for the organization where this application or proceeding is 
assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public 
PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see 
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 
(toll-free). If you would like assistance from a USPTO Customer Service 
Representative or access to the automated information system, call 
800-786-9199 (IN USA OR CANADA) or 571-272-1000. 